
fix: sticky sessions #140

Merged
thushan merged 11 commits into main from feature/sticky-session-fixes
Apr 23, 2026

Conversation

@thushan
Owner

@thushan thushan commented Apr 23, 2026

This PR addresses some limitations in sticky sessions that broke after the fixes in v0.0.26.

Summary by CodeRabbit

  • New Features

    • Sticky sessions with configurable idle TTL and high-concurrency limits; keys can be derived from session headers or request prefixes.
    • Anthropic Messages passthrough support in OpenAI-compatible profiles (disabled by default).
  • Improvements

    • Provider routing now preserves caller spelling/aliasing when forwarding requests.
  • Tests

    • New manual end-to-end sticky-session test harness and Makefile targets with Docker-backed mock backends for local verification.

@coderabbitai

coderabbitai Bot commented Apr 23, 2026

Walkthrough

Adds end-to-end sticky-session testing infrastructure and related runtime changes: sticky-key derivation fixes, handler integration to inject keys, registry routing propagation, AIMock-backed Docker fixtures and compose, Makefile targets, and comprehensive shell test scripts and documentation for manual/automated sticky-session validation.

Changes

Cohort / File(s) Summary
Core balancer & key logic
internal/adapter/balancer/sticky.go
Derive prefix-hash sticky key from legacy prompt when messages is absent/empty; preserve previous key format and empty-source behavior.
Handler changes
internal/app/handlers/handler_common.go, internal/app/handlers/handler_provider_common.go
Add getRawProviderPrefix; provider proxy derives strip prefix from raw path and injects sticky key after request analysis (before selection) when enabled.
Discovery / registry
internal/app/services/discovery.go
Forward operator-configured RoutingStrategy into registry config passed to model registry.
Tests (unit & metrics)
internal/adapter/balancer/sticky_metrics_test.go, internal/adapter/balancer/sticky_test.go, internal/app/handlers/handler_provider_test.go
New tests for sticky key computation and wrapper metrics; TTL expiry test adjusted to avoid sliding refresh; provider-proxy handler tests added with test doubles capturing sticky outcome.
Manual test config & profiles
test/manual/config.sticky.yaml, config/profiles/openai-compatible.yaml
Add manual sticky test profile (static discovery, sherpa proxy, sticky params, Anthropic translator passthrough configured) and add api.anthropic_support block to OpenAI-compatible profile.
Docker mock fixtures & compose
test/mock/compose.yaml, test/mock/fixtures/instance-a.json, test/mock/fixtures/instance-b.json, test/mock/fixtures/instance-c.json
Compose to start three AIMock instances (ports 9300–9302) with per-instance fixtures that return identifiable responses and streaming profiles for affinity testing.
Test scripts & orchestration
test/scripts/sticky/run-manual.sh, test/scripts/sticky/test-sticky-provider-routes.sh, makefile
Makefile targets to manage mocks and run manual test; orchestration script to start AIMock + Olla and teardown; regression script asserting miss→hit, backend consistency, turn‑3 diversity and sticky stats.
Documentation / skill
.claude/skills/test-sticky-sessions.md
New Claude skill doc describing end-to-end sticky-session manual test procedure, expected outputs and curl checks.

Sequence Diagram(s)

sequenceDiagram
    participant Tester
    participant Olla as Olla (proxy)
    participant Balancer as StickyBalancer
    participant AIMock as AIMock Backends
    participant Registry as ModelRegistry

    Tester->>Olla: POST /olla/<provider>/... (X-Olla-Session-ID or prompt/messages)
    Olla->>Olla: analyzeRequest (populate model)
    Olla->>Balancer: ComputeStickyKey (session_header / prefix_hash)
    Balancer->>Registry: Select endpoint (uses sticky key)
    Registry-->>Balancer: Endpoint list / routing strategy
    Balancer->>AIMock: Forward request to selected backend
    AIMock-->>Balancer: Response (includes BACKEND marker)
    Balancer-->>Olla: Sticky outcome (miss/hit) in context
    Olla-->>Tester: Response with X-Olla-Sticky-Session, X-Olla-Endpoint

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested labels

bug

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

  • Docstring Coverage (⚠️ Warning): docstring coverage is 76.00%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
  • Title check (❓ Inconclusive): the title 'fix: sticky sessions' is vague and does not clearly convey which aspect of sticky sessions was fixed. Resolution: revise the title to be more specific, such as 'fix: sticky sessions for prompt-only requests' or 'fix: sticky session affinity and TTL handling'.
✅ Passed checks (3 passed)
  • Description Check (✅ Passed): check skipped because CodeRabbit's high-level summary is enabled.
  • Linked Issues check (✅ Passed): check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes check (✅ Passed): check skipped because no linked issues were found for this pull request.




@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 5

🧹 Nitpick comments (2)
test/mock/compose.yaml (1)

21-21: Pin the AIMock image used by the test fixture to a specific version.

Using :latest makes the sticky-session harness depend on a mutable external image tag. A future AIMock push can change responses or health check behaviour without any code change, affecting test reproducibility. Pin all three services to a fixed version tag or digest.

This applies to lines 21, 37, and 52.
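A pinned service entry could look like the sketch below. The service name follows the `mock-compat-*` naming seen in the test output, and the version tag is a placeholder, not a real AIMock release; substitute the tag or digest actually verified against.

```yaml
services:
  mock-compat-a:
    # Pinned to an immutable reference instead of :latest.
    # "v1.2.3" is a placeholder; use the actual AIMock release tag
    # (or a sha256 digest) that the fixtures were validated with.
    image: ghcr.io/copilotkit/aimock:v1.2.3
```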

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/mock/compose.yaml` at line 21, Replace the mutable AIMock image tag
"ghcr.io/copilotkit/aimock:latest" used in the compose test fixture with a fixed
immutable reference (specific version tag or digest) for all three AIMock
service entries (the occurrences of image: ghcr.io/copilotkit/aimock:latest), so
update each image line to a pinned tag or sha256 digest to ensure reproducible
tests.
makefile (1)

370-392: Use docker compose up --wait for cleaner service readiness checks.

The --wait and --wait-timeout flags on docker compose up are simpler than custom polling logic and handle health checks automatically. The current polling implementation works correctly with modern Docker Compose versions (v2.21.0+, which output newline-delimited JSON matching the parser logic), but replacing the loop with --wait improves readability and reduces complexity.

Suggested simplification
 mock-up:
 	@echo "Starting AIMock instances..."
-	@docker compose -f test/mock/compose.yaml up -d
-	@echo "Waiting for AIMock health checks..."
-	@attempt=0; \
-	while true; do \
-		attempt=$$((attempt+1)); \
-		if [ $$attempt -gt 40 ]; then \
-			echo "ERROR: AIMock did not become healthy after 40s"; \
-			docker compose -f test/mock/compose.yaml ps; \
-			exit 1; \
-		fi; \
-		unhealthy=$$(docker compose -f test/mock/compose.yaml ps --format json 2>/dev/null | \
-			python3 -c "import sys,json; data=sys.stdin.read(); items=[json.loads(l) for l in data.splitlines() if l.strip()]; print(sum(1 for i in items if i.get('Health','') not in ('healthy','')))" 2>/dev/null || echo "0"); \
-		total=$$(docker compose -f test/mock/compose.yaml ps -q 2>/dev/null | wc -l | tr -d ' '); \
-		healthy=$$(docker compose -f test/mock/compose.yaml ps --format json 2>/dev/null | \
-			python3 -c "import sys,json; data=sys.stdin.read(); items=[json.loads(l) for l in data.splitlines() if l.strip()]; print(sum(1 for i in items if i.get('Health','') == 'healthy'))" 2>/dev/null || echo "0"); \
-		if [ "$$healthy" = "3" ]; then \
-			echo "All 3 AIMock instances are healthy"; \
-			break; \
-		fi; \
-		sleep 1; \
-	done
+	@docker compose -f test/mock/compose.yaml up -d --wait --wait-timeout 40
+	@echo "All 3 AIMock instances are healthy"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@makefile` around lines 370 - 392, Replace the custom polling loop in the
mock-up make target with Docker Compose's built-in readiness flags: call docker
compose -f test/mock/compose.yaml up -d --wait --wait-timeout 40s (or your
preferred timeout) immediately after the "Starting AIMock instances..." echo,
and remove the while-loop that checks ps/Health/healthy and sleeps; keep the
existing echo messages but rely on the up --wait exit code to fail the target if
services do not become healthy.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.claude/skills/test-sticky-sessions.md:
- Around line 105-140: The two plain output/code fences showing the "Olla Sticky
Session — All Provider Routes Regression Test" header and the "X-Olla-*"
response lines need a language tag to satisfy markdown lint; update the opening
triple-backtick of those two blocks to ```text (the blocks containing the header
box "Olla Sticky Session — All Provider Routes Regression Test" and the later
block with "X-Olla-Sticky-Session", "X-Olla-Sticky-Key-Source",
"X-Olla-Endpoint") so the content remains unchanged but the fences are marked as
text; make the same change to the second occurrence that contains the X-Olla-*
lines.

In `@internal/adapter/balancer/sticky.go`:
- Around line 208-211: The fallback logic in sticky.go uses raw :=
gjson.GetBytes(body, "messages").Raw and only checks raw == "" so an empty
messages array ("[]") prevents falling back to prompt; update the condition to
treat empty JSON arrays (raw == "[]") and null (raw == "null") as missing as
well (e.g., if raw == "" || raw == "[]" || raw == "null" then set raw =
gjson.GetBytes(body, "prompt").Raw), ensuring the key is derived from prompt
when messages is an empty array; add a unit test covering a request with
{"messages": [], "prompt": "..."} to verify correct fallback behavior.
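The suggested fallback condition can be sketched as follows. This is an illustrative stdlib-only version (the real sticky.go uses gjson, whose `.Raw` field returns `""` for an absent path); the helper name `stickySource` is hypothetical.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// stickySource sketches the suggested fallback: an absent, empty ("[]"),
// or null messages field falls back to the legacy prompt field, so the
// sticky key is derived from the prompt rather than an empty array.
func stickySource(body []byte) json.RawMessage {
	var req struct {
		Messages json.RawMessage `json:"messages"`
		Prompt   json.RawMessage `json:"prompt"`
	}
	if err := json.Unmarshal(body, &req); err != nil {
		return nil
	}
	m := bytes.TrimSpace(req.Messages)
	if len(m) == 0 || bytes.Equal(m, []byte("[]")) || bytes.Equal(m, []byte("null")) {
		return req.Prompt
	}
	return req.Messages
}

func main() {
	fmt.Printf("%s\n", stickySource([]byte(`{"messages":[],"prompt":"hi"}`)))   // "hi"
	fmt.Printf("%s\n", stickySource([]byte(`{"messages":null,"prompt":"hi"}`))) // "hi"
	fmt.Printf("%s\n", stickySource([]byte(`{"prompt":"hi"}`)))                 // "hi"
}
```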

In `@internal/app/handlers/handler_common.go`:
- Around line 92-110: Update the comments for getProviderPrefix and
getRawProviderPrefix to accurately describe the returned prefix (no trailing
slash): change any examples or prose that say "/olla/<provider>/" to
"/olla/<provider>" and state explicitly that these helpers return the canonical
strip prefix without a trailing slash; ensure the doc for getRawProviderPrefix
still explains it preserves caller spelling but clarifies it returns
"/olla/<provider>" (no trailing slash) so path-stripping behavior is
unambiguous.

In `@test/scripts/sticky/test-sticky-provider-routes.sh`:
- Around line 141-157: The test currently compares backend markers but doesn't
assert they are non-empty, so if AIMock stops embedding BACKEND:instance-* the
comparison can pass silently; update the Turn 1 and Turn 2 checks (around
variables marker1 and marker2 produced by extract_backend_marker) to assert the
marker is non-empty before doing any equality comparison—i.e., after computing
marker1 and marker2 call a non-empty check (fail the test if empty) so fixture
regressions fail loudly, and then perform the existing “same backend marker”
assertion only when both markers are present.
- Around line 190-209: The diversity check currently treats empty or error
responses as a different backend because ep_div can be empty; modify the loop so
you only mark seen_other when the request succeeded (check http_code starts with
"2") and extract_header returned a non-empty X-Olla-Endpoint that differs from
ep1; update the condition that sets seen_other to require both a 2xx http_code
and ep_div != "" and ep_div != "$ep1" (refer to variables http_code, ep_div,
seen_other and function extract_header in the POST loop).
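The suggested diversity condition can be sketched as below. The variable names (`ep1`, `ep_div`, `http_code`, `seen_other`) follow the script; the response data here is simulated rather than coming from real curl calls.

```shell
#!/usr/bin/env bash
# Only a 2xx response with a non-empty, different X-Olla-Endpoint counts
# as reaching another backend; failures and missing headers are ignored.
ep1="mock-compat-a"
seen_other=false
for resp in "500|" "200|" "200|mock-compat-a" "200|mock-compat-b"; do
    http_code="${resp%%|*}"   # simulated HTTP status
    ep_div="${resp#*|}"       # simulated X-Olla-Endpoint header value
    if [[ "$http_code" =~ ^2 && -n "$ep_div" && "$ep_div" != "$ep1" ]]; then
        seen_other=true
    fi
done
echo "$seen_other"            # true (only the last response qualifies)
```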

---

Nitpick comments:
In `@makefile`:
- Around line 370-392: Replace the custom polling loop in the mock-up make
target with Docker Compose's built-in readiness flags: call docker compose -f
test/mock/compose.yaml up -d --wait --wait-timeout 40s (or your preferred
timeout) immediately after the "Starting AIMock instances..." echo, and remove
the while-loop that checks ps/Health/healthy and sleeps; keep the existing echo
messages but rely on the up --wait exit code to fail the target if services do
not become healthy.

In `@test/mock/compose.yaml`:
- Line 21: Replace the mutable AIMock image tag
"ghcr.io/copilotkit/aimock:latest" used in the compose test fixture with a fixed
immutable reference (specific version tag or digest) for all three AIMock
service entries (the occurrences of image: ghcr.io/copilotkit/aimock:latest), so
update each image line to a pinned tag or sha256 digest to ensure reproducible
tests.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 4f5a5a6e-d7fe-48c1-9b26-d0832c7ddb58

📥 Commits

Reviewing files that changed from the base of the PR and between ebed952 and 7985c7c.

📒 Files selected for processing (17)
  • .claude/skills/test-sticky-sessions.md
  • config/profiles/openai-compatible.yaml
  • internal/adapter/balancer/sticky.go
  • internal/adapter/balancer/sticky_metrics_test.go
  • internal/adapter/balancer/sticky_test.go
  • internal/app/handlers/handler_common.go
  • internal/app/handlers/handler_provider_common.go
  • internal/app/handlers/handler_provider_test.go
  • internal/app/services/discovery.go
  • makefile
  • test/manual/config.sticky.yaml
  • test/mock/compose.yaml
  • test/mock/fixtures/instance-a.json
  • test/mock/fixtures/instance-b.json
  • test/mock/fixtures/instance-c.json
  • test/scripts/sticky/run-manual.sh
  • test/scripts/sticky/test-sticky-provider-routes.sh


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.claude/skills/test-sticky-sessions.md:
- Around line 76-80: Update the Turn 3 documentation in
.claude/skills/test-sticky-sessions.md to match the actual test harness
behavior: change "10 fresh sessions" to "30 fresh sessions" and remove or
correct any statement that implies the main-proxy path is evaluated for
diversity (i.e., remove "shows main-proxy passing Turn 3" or explicitly document
that main-proxy is skipped for diversity). Ensure the description for Turn 3
notes that diversity is checked across 30 fresh sessions and that main-proxy is
excluded, and apply the same corrections to the other affected block referenced
(the section around lines 105-139).

In `@internal/adapter/balancer/sticky_metrics_test.go`:
- Around line 123-133: The test currently passes a nil request body so
ComputeStickyKey may not see the JSON messages array; attach the JSON payload to
the http.Request and set the Content-Type so the function actually parses
messages:[] and falls back to prompt. Concretely, update the test to set
req.Body (and Content-Type header) using the bytes from body before calling
ComputeStickyKey (the symbols involved are req, body and ComputeStickyKey with
cfg) so the assertion verifies fallback from messages to prompt.

In `@test/scripts/sticky/test-sticky-provider-routes.sh`:
- Around line 141-184: The test compares ep1 and ep2 without ensuring the
X-Olla-Endpoint header exists, so add an explicit presence check before the
equality assertion: after extracting ep1 and ep2 (variables ep1, ep2) verify [[
-n "$ep1" && -n "$ep2" ]] and only then perform the equality check [[ "$ep2" ==
"$ep1" ]] to pass "Turn 2 same endpoint"; otherwise call fail with a clear
message like "Turn 2 missing X-Olla-Endpoint header(s): ep1='...'/ep2='...'" so
missing headers don't silently pass as equal.
- Around line 126-132: The curl command substitution that sets http_code (the
curl invocation using --max-time "$CURL_TIMEOUT", -X POST, -H
"X-Olla-Session-ID: ${session_id}", -d "$body_json", "${OLLA_URL}${url_path}"
and writing to "$body_file" / "$headers_file") can abort the script under set -e
on timeout/failure; append || true to each curl command substitution so failures
don't exit the script and http_code can be inspected by the existing assertions
and fail()/cleanup logic; apply the same change to the other curl invocations
around the same tests (the blocks that set http_code at the other mentioned
ranges).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 26a94317-12fe-45b2-9f6b-4edd5e146a3c

📥 Commits

Reviewing files that changed from the base of the PR and between 7985c7c and f350dcd.

📒 Files selected for processing (7)
  • .claude/skills/test-sticky-sessions.md
  • internal/adapter/balancer/sticky.go
  • internal/adapter/balancer/sticky_metrics_test.go
  • internal/app/handlers/handler_common.go
  • makefile
  • test/mock/compose.yaml
  • test/scripts/sticky/test-sticky-provider-routes.sh
🚧 Files skipped from review as they are similar to previous changes (2)
  • test/mock/compose.yaml
  • internal/adapter/balancer/sticky.go

Comment on lines +76 to +80
For each active (non-skipped) route, the script asserts:
- Turn 1: `X-Olla-Sticky-Session: miss`, `X-Olla-Sticky-Key-Source: session_header`
- Turn 2: `X-Olla-Sticky-Session: hit`, same `X-Olla-Endpoint` as Turn 1, same backend marker
- Turn 3: across 10 fresh sessions, at least one lands on a different backend
- Anthropic path additionally asserts `X-Olla-Mode: passthrough`

⚠️ Potential issue | 🟡 Minor

Align the documented Turn 3 behaviour with the script.

The harness sends 30 fresh sessions and skips main-proxy diversity, but the doc says 10 sessions and shows main-proxy passing Turn 3. This will mislead manual troubleshooting.

📝 Proposed fix
-- Turn 3: across 10 fresh sessions, at least one lands on a different backend
+- Turn 3: across 30 fresh sessions, at least one lands on a different backend
@@
 ── main-proxy ──
   ✓ PASS — Turn 1 HTTP 200
   ✓ PASS — Turn 1 sticky=miss
   ✓ PASS — Turn 1 key-source=session_header
   Pinned to: mock-compat-b (BACKEND:instance-b)
   ✓ PASS — Turn 2 HTTP 200
   ✓ PASS — Turn 2 sticky=hit
   ✓ PASS — Turn 2 same endpoint (mock-compat-b)
   ✓ PASS — Turn 2 same backend marker (BACKEND:instance-b)
-  ✓ PASS — Turn 3 load balancing reaches multiple backends
+  SKIP Turn 3 diversity — main-proxy pool is huge and LCB tie-break is deterministic at zero connections — spread not meaningful here

Also applies to: 105-139

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.claude/skills/test-sticky-sessions.md around lines 76 - 80, Update the Turn
3 documentation in .claude/skills/test-sticky-sessions.md to match the actual
test harness behavior: change "10 fresh sessions" to "30 fresh sessions" and
remove or correct any statement that implies the main-proxy path is evaluated
for diversity (i.e., remove "shows main-proxy passing Turn 3" or explicitly
document that main-proxy is skipped for diversity). Ensure the description for
Turn 3 notes that diversity is checked across 30 fresh sessions and that
main-proxy is excluded, and apply the same corrections to the other affected
block referenced (the section around lines 105-139).

Comment on lines +123 to +133
	body := []byte(`{"model":"llama3","messages":[],"prompt":"hi","max_tokens":50}`)
	req, _ := http.NewRequest(http.MethodPost, "/", nil)

	cfg := config.StickySessionConfig{
		KeySources:      []string{"prefix_hash"},
		PrefixHashBytes: 512,
	}
	key, source := ComputeStickyKey(req, "llama3", cfg, body)

	assert.Equal(t, "prefix_hash", source, "empty messages array must fall back to prompt via prefix_hash")
	assert.NotEmpty(t, key, "prompt fallback must produce a non-empty key when messages is []")

⚠️ Potential issue | 🟡 Minor

Make this test prove messages:[] falls back to prompt.

The previous broken behaviour could hash the raw [] value and still return a non-empty prefix_hash key, so this assertion would pass without validating the fallback.

🧪 Proposed fix
-	body := []byte(`{"model":"llama3","messages":[],"prompt":"hi","max_tokens":50}`)
+	body := []byte(`{"model":"llama3","messages":[],"prompt":"hi","max_tokens":50}`)
+	promptOnlyBody := []byte(`{"model":"llama3","prompt":"hi","max_tokens":50}`)
 	req, _ := http.NewRequest(http.MethodPost, "/", nil)
 
 	cfg := config.StickySessionConfig{
 		KeySources:      []string{"prefix_hash"},
 		PrefixHashBytes: 512,
 	}
 	key, source := ComputeStickyKey(req, "llama3", cfg, body)
+	promptOnlyKey, promptOnlySource := ComputeStickyKey(req, "llama3", cfg, promptOnlyBody)
 
 	assert.Equal(t, "prefix_hash", source, "empty messages array must fall back to prompt via prefix_hash")
 	assert.NotEmpty(t, key, "prompt fallback must produce a non-empty key when messages is []")
+	assert.Equal(t, promptOnlySource, source)
+	assert.Equal(t, promptOnlyKey, key, "messages:[] must hash the prompt, not the empty messages array")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/adapter/balancer/sticky_metrics_test.go` around lines 123 - 133, The
test currently passes a nil request body so ComputeStickyKey may not see the
JSON messages array; attach the JSON payload to the http.Request and set the
Content-Type so the function actually parses messages:[] and falls back to
prompt. Concretely, update the test to set req.Body (and Content-Type header)
using the bytes from body before calling ComputeStickyKey (the symbols involved
are req, body and ComputeStickyKey with cfg) so the assertion verifies fallback
from messages to prompt.

Comment on lines +126 to +132
    http_code=$(curl -s -w "%{http_code}" -o "$body_file" -D "$headers_file" \
        --max-time "$CURL_TIMEOUT" \
        -X POST \
        -H "Content-Type: application/json" \
        -H "X-Olla-Session-ID: ${session_id}" \
        -d "$body_json" \
        "${OLLA_URL}${url_path}" 2>/dev/null)

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/usr/bin/env bash
set -euo pipefail

if bash -c 'set -e; v=$(false); echo unreachable' >/dev/null 2>&1; then
  echo "unexpected: assignment survived a failing command substitution"
  exit 1
fi

bash -c 'set -e; v=$(false || true); echo "survived with explicit fallback"'

Repository: thushan/olla

Length of output: 87


🏁 Script executed:

# Check file existence and line count
wc -l test/scripts/sticky/test-sticky-provider-routes.sh

# Verify the shebang and set options at the start
head -20 test/scripts/sticky/test-sticky-provider-routes.sh

# Check the specific lines mentioned in the comment
sed -n '126,132p' test/scripts/sticky/test-sticky-provider-routes.sh

# Check the second location
sed -n '168,174p' test/scripts/sticky/test-sticky-provider-routes.sh

# Check the third location
sed -n '205,211p' test/scripts/sticky/test-sticky-provider-routes.sh

Repository: thushan/olla

Length of output: 1931


🏁 Script executed:

# Check for set -euo pipefail
grep -n "set -euo pipefail" test/scripts/sticky/test-sticky-provider-routes.sh

# Get lines 20-140 with line numbers to see the first curl block
sed -n '20,140p' test/scripts/sticky/test-sticky-provider-routes.sh | cat -n | head -80

# Get lines 160-180 with line numbers for context
sed -n '160,180p' test/scripts/sticky/test-sticky-provider-routes.sh | cat -n

# Get lines 200-220 with line numbers
sed -n '200,220p' test/scripts/sticky/test-sticky-provider-routes.sh | cat -n

Repository: thushan/olla

Length of output: 5316


🏁 Script executed:

# Get the exact lines 126-132 with context
sed -n '120,135p' test/scripts/sticky/test-sticky-provider-routes.sh | cat -n

# Verify all three curl locations have the same pattern (no || true)
rg -n 'http_code=\$\(curl' test/scripts/sticky/test-sticky-provider-routes.sh

Repository: thushan/olla

Length of output: 1053


Prevent curl failures from aborting the script with set -e.

With set -e enabled, any curl timeout or connection failure in the command substitution will exit the script immediately, bypassing the fail() function, cleanup, and final test summary. Add || true to each curl call so failures are captured in http_code and handled by the existing assertions.

Proposed fix
-    http_code=$(curl -s -w "%{http_code}" -o "$body_file" -D "$headers_file" \
+    http_code=$(curl -s -w "%{http_code}" -o "$body_file" -D "$headers_file" \
         --max-time "$CURL_TIMEOUT" \
         -X POST \
         -H "Content-Type: application/json" \
         -H "X-Olla-Session-ID: ${session_id}" \
         -d "$body_json" \
-        "${OLLA_URL}${url_path}" 2>/dev/null)
+        "${OLLA_URL}${url_path}" 2>/dev/null || true)
@@
-    http_code=$(curl -s -w "%{http_code}" -o "$body_file" -D "$headers_file" \
+    http_code=$(curl -s -w "%{http_code}" -o "$body_file" -D "$headers_file" \
         --max-time "$CURL_TIMEOUT" \
         -X POST \
         -H "Content-Type: application/json" \
         -H "X-Olla-Session-ID: ${session_id}" \
         -d "$body_json" \
-        "${OLLA_URL}${url_path}" 2>/dev/null)
+        "${OLLA_URL}${url_path}" 2>/dev/null || true)
@@
-            http_code=$(curl -s -w "%{http_code}" -o "$body_file" -D "$headers_file" \
+            http_code=$(curl -s -w "%{http_code}" -o "$body_file" -D "$headers_file" \
                 --max-time "$CURL_TIMEOUT" \
                 -X POST \
                 -H "Content-Type: application/json" \
                 -H "X-Olla-Session-ID: ${new_session}" \
                 -d "$body_json" \
-                "${OLLA_URL}${url_path}" 2>/dev/null)
+                "${OLLA_URL}${url_path}" 2>/dev/null || true)

Also applies to: 168-174, 205-211

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/scripts/sticky/test-sticky-provider-routes.sh` around lines 126 - 132,
The curl command substitution that sets http_code (the curl invocation using
--max-time "$CURL_TIMEOUT", -X POST, -H "X-Olla-Session-ID: ${session_id}", -d
"$body_json", "${OLLA_URL}${url_path}" and writing to "$body_file" /
"$headers_file") can abort the script under set -e on timeout/failure; append ||
true to each curl command substitution so failures don't exit the script and
http_code can be inspected by the existing assertions and fail()/cleanup logic;
apply the same change to the other curl invocations around the same tests (the
blocks that set http_code at the other mentioned ranges).

Comment on lines +141 to +184
    local sticky1 ep1 mode1 marker1
    sticky1=$(extract_header "$headers_file" "X-Olla-Sticky-Session")
    ep1=$(extract_header "$headers_file" "X-Olla-Endpoint")
    mode1=$(extract_header "$headers_file" "X-Olla-Mode")
    marker1=$(extract_backend_marker "$(cat "$body_file")")

    if [[ -z "$marker1" ]]; then
        fail "Turn 1 fixture marker missing — AIMock fixture regression"
        rm -f "$headers_file" "$body_file"
        return
    fi

    [[ "$sticky1" == "miss" ]] && pass "Turn 1 sticky=miss" || fail "Turn 1 sticky='${sticky1}' (expected miss)"

    local key_src1
    key_src1=$(extract_header "$headers_file" "X-Olla-Sticky-Key-Source")
    [[ "$key_src1" == "session_header" ]] && pass "Turn 1 key-source=session_header" || fail "Turn 1 key-source='${key_src1}' (expected session_header)"

    if [[ "$check_passthrough" == "true" ]]; then
        [[ "$mode1" == "passthrough" ]] && pass "Turn 1 X-Olla-Mode=passthrough" || fail "Turn 1 X-Olla-Mode='${mode1}' (expected passthrough)"
    fi

    echo -e " ${GREY}Pinned to: ${ep1} (${marker1})${RESET}"

    # ── Turn 2: expect hit ────────────────────────────────────────────────────
    : > "$headers_file"
    : > "$body_file"
    http_code=$(curl -s -w "%{http_code}" -o "$body_file" -D "$headers_file" \
        --max-time "$CURL_TIMEOUT" \
        -X POST \
        -H "Content-Type: application/json" \
        -H "X-Olla-Session-ID: ${session_id}" \
        -d "$body_json" \
        "${OLLA_URL}${url_path}" 2>/dev/null)

    [[ "$http_code" =~ ^2 ]] && pass "Turn 2 HTTP ${http_code}" || fail "Turn 2 HTTP ${http_code} (expected 2xx)"

    local sticky2 ep2 marker2
    sticky2=$(extract_header "$headers_file" "X-Olla-Sticky-Session")
    ep2=$(extract_header "$headers_file" "X-Olla-Endpoint")
    marker2=$(extract_backend_marker "$(cat "$body_file")")

    [[ "$sticky2" == "hit" ]] && pass "Turn 2 sticky=hit" || fail "Turn 2 sticky='${sticky2}' (expected hit)"
    [[ "$ep2" == "$ep1" ]] && pass "Turn 2 same endpoint (${ep1})" || fail "Turn 2 endpoint changed: '${ep1}' → '${ep2}'"

⚠️ Potential issue | 🟡 Minor

Assert X-Olla-Endpoint is present before comparing it.

If both responses omit the endpoint header, Line 184 reports “same endpoint” with two empty values, so the test can miss a response-header regression.

🐛 Proposed fix
     sticky1=$(extract_header "$headers_file" "X-Olla-Sticky-Session")
     ep1=$(extract_header "$headers_file" "X-Olla-Endpoint")
     mode1=$(extract_header "$headers_file" "X-Olla-Mode")
     marker1=$(extract_backend_marker "$(cat "$body_file")")
 
+    if [[ -z "$ep1" ]]; then
+        fail "Turn 1 X-Olla-Endpoint header missing"
+        rm -f "$headers_file" "$body_file"
+        return
+    fi
+
     if [[ -z "$marker1" ]]; then
         fail "Turn 1 fixture marker missing — AIMock fixture regression"
         rm -f "$headers_file" "$body_file"
         return
@@
     sticky2=$(extract_header "$headers_file" "X-Olla-Sticky-Session")
     ep2=$(extract_header "$headers_file" "X-Olla-Endpoint")
     marker2=$(extract_backend_marker "$(cat "$body_file")")
 
     [[ "$sticky2" == "hit" ]] && pass "Turn 2 sticky=hit" || fail "Turn 2 sticky='${sticky2}' (expected hit)"
-    [[ "$ep2" == "$ep1" ]] && pass "Turn 2 same endpoint (${ep1})" || fail "Turn 2 endpoint changed: '${ep1}' → '${ep2}'"
+    if [[ -z "$ep2" ]]; then
+        fail "Turn 2 X-Olla-Endpoint header missing"
+    else
+        [[ "$ep2" == "$ep1" ]] && pass "Turn 2 same endpoint (${ep1})" || fail "Turn 2 endpoint changed: '${ep1}' → '${ep2}'"
+    fi
🧰 Tools
🪛 Shellcheck (0.11.0)

[info] 153-153: Note that A && B || C is not if-then-else. C may run when A is true.

(SC2015)


[info] 157-157: Note that A && B || C is not if-then-else. C may run when A is true.

(SC2015)


[info] 160-160: Note that A && B || C is not if-then-else. C may run when A is true.

(SC2015)


[info] 176-176: Note that A && B || C is not if-then-else. C may run when A is true.

(SC2015)


[info] 183-183: Note that A && B || C is not if-then-else. C may run when A is true.

(SC2015)


[info] 184-184: Note that A && B || C is not if-then-else. C may run when A is true.

(SC2015)
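The SC2015 notes above can be reproduced in one line: with `A && B || C`, C also runs when A succeeds but B fails, so the script's pass/fail one-liners are only safe while `pass` itself never fails.

```shell
#!/usr/bin/env bash
# A && B || C is not if-then-else: C runs whenever B fails, even if A succeeded.
result=$([[ 1 -eq 1 ]] && false || echo "C ran despite A being true")
echo "$result"
```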

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/scripts/sticky/test-sticky-provider-routes.sh` around lines 141 - 184,
The test compares ep1 and ep2 without ensuring the X-Olla-Endpoint header
exists, so add an explicit presence check before the equality assertion: after
extracting ep1 and ep2 (variables ep1, ep2) verify [[ -n "$ep1" && -n "$ep2" ]]
and only then perform the equality check [[ "$ep2" == "$ep1" ]] to pass "Turn 2
same endpoint"; otherwise call fail with a clear message like "Turn 2 missing
X-Olla-Endpoint header(s): ep1='...'/ep2='...'" so missing headers don't
silently pass as equal.

@thushan thushan merged commit a500060 into main Apr 23, 2026
6 checks passed
@thushan thushan deleted the feature/sticky-session-fixes branch April 23, 2026 03:49
